D.20 STATISTICA
Approximate Cost: $1,195
Source: StatSoft (www.statsoft.com)
Current Version: STATISTICA 12
Operating System Needs: Windows 7 (recommended), Windows Vista, Windows XP
Input Structure: Can directly open spreadsheet, text, and database files
Overview
STATISTICA is user friendly while still offering significant customization and functionality. The base package computes practically all common descriptive statistics and can produce a wide variety of customizable graphics. The base software includes graphics tools along with the following modules:
- descriptive statistics, breakdowns, and exploratory data analysis
- correlations
- basic statistics from results spreadsheets (tables)
- interactive probability calculator
- t-tests (and other tests of group differences)
- frequency tables, cross-tabulation tables, stub-and-banner tables, multiple response analysis
- multiple regression methods
- nonparametric statistics
- one-way analysis of variance (ANOVA)/multivariate ANOVA (MANOVA)
- distribution fitting
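To make one of the listed procedures concrete, the following is a minimal Python sketch of the F statistic behind a one-way ANOVA. It is not STATISTICA code and does not reflect how STATISTICA computes its results; the function name `one_way_anova_f` and the well concentrations are hypothetical and chosen only for illustration.

```python
from statistics import fmean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA.

    `groups` is a list of lists of measurements, one list per group.
    """
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    if k < 2 or n_total <= k:
        raise ValueError("need at least two groups and more observations than groups")

    grand_mean = fmean(x for g in groups for x in g)
    group_means = [fmean(g) for g in groups]

    # Between-group sum of squares: spread of group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means))
    # Within-group sum of squares: spread of observations around their own group mean.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, group_means) for x in g)

    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical concentrations (mg/L) from three monitoring wells.
wells = [[4.1, 3.9, 4.4, 4.0], [5.2, 5.0, 5.5, 4.9], [4.0, 4.2, 3.8, 4.1]]
f_stat, df_b, df_w = one_way_anova_f(wells)
print(f"F = {f_stat:.2f} on ({df_b}, {df_w}) degrees of freedom")
```

Converting the F statistic to a p-value requires the F distribution, which is not in the Python standard library and is left out of this sketch.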
STATISTICA Query accesses data from databases using Microsoft's OLE DB conventions, which simplifies statistical analysis of large and changing databases. STATISTICA also interacts directly with the free software package R, giving users access to features not present in the base software while maintaining the customization and ease of use of STATISTICA.
| Statistical Method | Capability As Is | Capability with Scripts/Add-Ins |
|---|---|---|
| Handling of NDs | | |
| | ● | ● |
| | | ● |
| | | |
| | | ◒ |
| Exploratory/Diagnostic Tools | | |
| Summary Statistics | ● | ● |
| | ● | ● |
| | ● | ● |
| Data transformations | ● | ● |
| Statistical Design | | |
| Statistical Power | ● | ● |
| | | ● |
| Contaminant ranking | ● | ● |
| | | |
| Statistical Limits | | |
| | ● | ● |
| | ● | ● |
| | ● | ● |
| Testing Compliance Limits | ◒ | ● |
| Graphics | | |
| Plots/Charts | ● | ● |
| Batch plots | ● | ● |
| Tweaking of graphics | ● | ● |
| Statistical Comparisons | | |
| | ● | ● |
| | ● | ● |
| Spatial Analysis | | |
| Geostatistics/Mapping | | |
| | | |
| | | |
| Regression/Time Series | | |
| | ◒ | ● |
| | | |
| | ◒ | ● |
| | ◒ | ● |
| | | |
| | ◒ | ● |
| Multivariate Analysis | | |
| Multiple regression | ◒ | ● |
| Factor/Discriminant analysis | | ● |
| | | ● |
Capability Ratings:
N/A = Not applicable or not available
● = Full capability
◒ = Some capability
(blank cell) = No capability
Add-Ins Available
STATISTICA also has a number of add-in packages and modules that can enhance the functionality of the base software package.
- The Data Mining add-in is a versatile tool that includes techniques for quick analysis of large data files.
- The Market Basket add-in uses sequence, association, and link analysis to build models and extract rules from large data sets.
- The multivariate statistical process control add-in allows users to apply univariate and multivariate statistical methods for quality control, predictive modeling, and data reduction for complex processes; determine and optimize the most critical processes or factors; monitor process characteristics interactively; and build, evaluate, and use predictive models based on historical data.
- The quality control add-in includes a wide selection of quality control analysis techniques and additional quality control charts.
- Additional add-ins include advanced linear/nonlinear models, multivariate exploratory techniques, power analysis and interval estimation, and variance estimation and precision.
Ease of Use and Data Import
STATISTICA is designed as a user-friendly software package. For additional help with the program, you can watch a wide variety of training videos on the website or take a training seminar for STATISTICA basics or advanced topics.
STATISTICA can directly open many data types including databases, spreadsheets, and text files. Using STATISTICA Query and Visual Basic, you can easily query and import or export data from databases for statistical analysis. Output data sheets and plots can be sent to workbooks, STATISTICA reports, or Microsoft Word. STATISTICA can also coordinate with the free software package R, allowing you to run R scripts and algorithms from STATISTICA and customize outputs and graphics directly in STATISTICA. You can also record a macro of process steps, allowing for easy reproduction and duplication of analysis.
Types of Distributions
STATISTICA contains a distribution fitting option that directly compares the distribution of data to a wide variety of distributions. Available distributions for fitting include normal, rectangular, exponential, gamma, lognormal, chi-squared, Weibull, Gompertz, binomial, Poisson, geometric, and Bernoulli distributions. Once a distribution has been fit, you can evaluate the fit using a variety of tests and plots.
Additional distribution fitting options are available in the STATISTICA Process Analysis add-in, including the option to calculate maximum-likelihood parameter estimates. The STATISTICA Advanced Linear/Non-Linear Model add-in allows you to fit data to complex, custom-defined functions.
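As an illustration of what distribution fitting involves, the Python sketch below fits normal and lognormal models by maximum likelihood and compares their log-likelihoods. It is a generic example under stated assumptions, not STATISTICA's implementation; the function names `fit_normal` and `fit_lognormal` and the sample concentrations are hypothetical.

```python
import math
from statistics import fmean


def fit_normal(data):
    """Maximum-likelihood normal fit: returns (mean, sd, log-likelihood)."""
    n = len(data)
    mu = fmean(data)
    # The maximum-likelihood variance divides by n, not n - 1.
    var = sum((x - mu) ** 2 for x in data) / n
    # Log-likelihood evaluated at the fitted parameters.
    loglik = -0.5 * n * (math.log(2 * math.pi * var) + 1)
    return mu, math.sqrt(var), loglik


def fit_lognormal(data):
    """Maximum-likelihood lognormal fit; requires strictly positive data."""
    if any(x <= 0 for x in data):
        raise ValueError("lognormal fit needs strictly positive values")
    logs = [math.log(x) for x in data]
    mu, sd, loglik_logs = fit_normal(logs)
    # Change of variables: the density of x is the density of log(x) divided by x.
    loglik = loglik_logs - sum(logs)
    return mu, sd, loglik


# Hypothetical right-skewed concentrations (ug/L).
conc = [0.8, 1.1, 1.3, 1.9, 2.4, 3.0, 4.7, 7.5, 12.0, 21.0]
_, _, ll_norm = fit_normal(conc)
_, _, ll_lognorm = fit_lognormal(conc)
better = "lognormal" if ll_lognorm > ll_norm else "normal"
print(f"normal: {ll_norm:.2f}, lognormal: {ll_lognorm:.2f}, higher likelihood: {better}")
```

Because both models have two parameters, their log-likelihoods can be compared directly. Comparing models with different numbers of parameters would call for a penalized criterion or a formal goodness-of-fit test.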
Visualization
STATISTICA has a wide variety of plotting capabilities in both 2-D and 3-D. Plotting options in the base package include box plots, 2-D and 3-D histograms, bivariate distributions, 2-D and 3-D scatter plots, normal, half-normal, and detrended probability plots, quantile-quantile plots, probability-probability plots, contour plots, nonsmoothed surfaces, and icons. You can zoom in on portions of the graphs, which can be useful when visualizing larger data sets and when producing cross-section slices from 3-D graphics. STATISTICA also has the option of plotting multiple-subset scatter plots and categorized scatter plots. The program provides many options to customize and format figures and tables for reports and presentations.
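For readers unfamiliar with how a normal probability (quantile-quantile) plot is constructed, the following Python sketch computes the plotted coordinates. It is illustrative only and independent of STATISTICA; the function name `normal_qq_points`, the plotting-position formula, and the data values are assumptions made for this example.

```python
from statistics import NormalDist


def normal_qq_points(data):
    """Return (theoretical z-score, observed value) pairs for a normal Q-Q plot."""
    ordered = sorted(data)
    n = len(ordered)
    std_normal = NormalDist()  # standard normal: mean 0, standard deviation 1
    points = []
    for i, value in enumerate(ordered, start=1):
        # Plotting position (i - 0.5) / n; other conventions exist.
        p = (i - 0.5) / n
        points.append((std_normal.inv_cdf(p), value))
    return points


# Hypothetical concentrations (mg/L).
sample = [2.3, 1.7, 3.1, 2.0, 2.8, 9.4, 2.5]
for z, value in normal_qq_points(sample):
    print(f"z = {z:+.3f}  observed = {value}")
```

If the data are approximately normal, the pairs fall close to a straight line; strong curvature or isolated points far from the line suggest skewness or possible outliers. Which variable goes on which axis is a plotting convention and does not change the interpretation.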
Primary Uses for Groundwater Data Analysis
The STATISTICA base package and add-ins include a wide variety of customizable graphics, which are ideal for use in groundwater data analysis. These plots can be used to analyze distribution, illustrate general trends, and support conclusions derived from hypothesis testing and descriptive statistics. STATISTICA is also well known for its strong data mining add-in, which allows you to rapidly analyze large data files and data sets.
Benefits
- easy to use, with a wide variety of basic and advanced training videos and seminars available
- able to import, store, and export data easily
- direct interaction with R, giving advanced users additional statistical capabilities
- STATISTICA Query for querying databases directly
- ability to plot and customize a wide variety of graphics
- strong data mining capabilities with the STATISTICA data mining add-in module
- strong and easy-to-use multivariate analytic features not readily found in other statistical software
- macros provided to record steps for easy reproduction of data analysis
Limitations and Data Requirements
- cost
- some statistical tools are missing or have limited functionality relative to other statistical software packages
- requires purchase of add-ins and modules for complete functionality
Publication Date: December 2013